Abstract

Risk Limiting Dispatch (RLD) was proposed recently as a mechanism that utilizes information and
market recourse to reduce reserve capacity requirements and emissions and to achieve other system
operator objectives. It induces a set of simple dispatch rules that can be easily embedded into
existing dispatch systems to provide computationally efficient and reliable decisions. Storage is
emerging as an alternative to mitigate uncertainty in the grid. This paper extends the RLD framework
to incorporate fast-ramping storage. It develops a closed-form threshold rule for the optimal
stochastic dispatch incorporating a sequence of markets and real-time information. An efficient
algorithm to evaluate the thresholds is developed based on analysis of the optimal storage operation.
Simple approximations that rely on continuous-time approximations of the solution of the
discrete-time control problem are also studied. The benefits of storage with respect to prediction
quality and storage capacity are examined, and the overall effect on dispatch is quantified.
Numerical experiments illustrate the proposed procedures.

I. Introduction

The increased penetration of renewable generation in the grid increases the uncertainty that
needs to be managed by the system operator [1]. Existing system dispatch procedures are
designed assuming mild uncertainty, and utilize a worst-case schedule for generation by solving
deterministic controls where random demands are replaced by their forecasted values plus
"$3\sigma$" [2], where $\sigma$ is the standard deviation of the forecast error. Such rules require
excess reserve capacity and result in increased emissions if $\sigma$ is large [3], [4].

An approach to mitigating the impact of increased uncertainty is to utilize energy storage
[5]. Energy storage can be broadly classified into two groups depending on their scheduling
characteristics [6], [7]: fast storage and slow storage. Fast storage can be utilized to mitigate
intra-hourly variability of renewable supply. Slow storage can be utilized to transfer energy
between different hours, so excess production can match peak demands. In this paper we address
the integration of fast storage into power system dispatch.

Existing approaches to stochastic dispatch rely on stochastic programming based on techniques
like Monte Carlo scenario sampling that are hard to implement at large scale or do not incorporate
market recourse [8]-[13]. Moreover, the optimal decisions can be difficult to interpret in the
context of system operations. Recent work has proposed utilizing robust optimization formulations
with uncertainty sets [14], [15], but they do not capture multiple recourse opportunities and
can result in conservative dispatch decisions. Incorporating storage into these models results in
additional complexity and decisions which are hard to analyze. Risk Limiting Dispatch (RLD)
[16] was proposed as an alternative to capture multiple operating goals and provide reliable and
interpretable dispatch controls that can be readily incorporated in existing dispatch software. RLD
incorporates real-time forecast statistics and recourse opportunities, enabling the evaluation of
improvements in forecasting and market design [4]. In this paper we develop RLD to incorporate
fast storage.

Fast storage integration with renewables has been studied in a variety of scenarios. [17]
examines the benefits of storage for renewable integration in a single bus model. Optimal
contracting for co-located storage was studied in [18], [19], and the role of distributed storage was
studied in [20], [21]. Recent independent work [22] addresses system operator (SO) dispatch
of storage to mitigate net loads (scheduled load minus wind) to obtain analytic controls and
expressions for the value of storage, but focuses only on real-time dispatch. We consider an SO
dispatch process that utilizes a market with multiple recourse opportunities to evaluate the value
of improved forecasts and the impact of storage. An easy-to-compute numerical dispatch algorithm is
developed utilizing optimal control structural results and explicit formulae. Analytic relationships
between key quantities of interest are derived based on a continuous-time approximation of the
storage problem.

The remainder of the paper is organized as follows. Section II states the RLD with storage problem.
Section III establishes structural control results for dispatch. The optimal storage operation and
evaluation of the dispatch thresholds are studied in Section IV. An approximation scheme is
derived in Section V. Numerical results are presented in Section VI. Section VII concludes the
paper.

II. Problem Statement 

Grid operation is constrained so that supply must equal demand at each time instant. The 
system operator (SO) schedules conventional generation in a sequence of markets ahead of 
delivery time to ensure this constraint is met. Load and renewables are random, and only revealed 
at the delivery time interval. The goal of the operator is to find the optimal schedule and operation 
of grid resources given information about load and renewable generation. Fast storage can be 
used to smooth unpredicted variations of the net load (load minus renewables) in real-time. 




Fig. 1: Risk limiting dispatch with storage: The purchased conventional generation is supplied 
over the delivery time interval uniformly, while the wind generation and demand may vary over 
the time interval. A storage device is operated during the delivery time interval to minimize the 
terminal cost (penalty).

A. Model Formulation

There are $R$ markets, modeled as stages, ahead of the delivery time interval. For example,
stage 1 may occur 24 hours ahead of the delivery time; stage $R$ occurs 15 minutes ahead of
the delivery time; other stages occur in between. At each stage $r$, an operator makes a decision
to purchase $s_r$ units of energy for the delivery time interval. Note $s_r$ is a forward
contract and may be interpreted as a contract for reserve capacity. The price $c_r$ per unit of $s_r$
is known in advance. These reserve capacities differ in two respects: the $r$-th capacity
must be available in shorter time than the $(r-1)$-th capacity, and their prices are different. The
operator may also sell reserve capacity at various stages, in which case $s_r < 0$. We denote the
index set for all dispatch stages as $\mathcal{R} = \{1, 2, \dots, R\}$, and use $r$ to denote one of
the stages in $\mathcal{R}$. Whether buying or selling is allowed at each stage is specified ahead of
time.

Further, to avoid trivial solutions, some constraints are imposed on the prices. For two buying
dispatch stages $r_1, r_2 \in \mathcal{R}$ with $r_1 < r_2$, we require $0 < c_{r_1} < c_{r_2}$, i.e.,
the price of purchasing power increases as the delivery deadline approaches. If $c_{r_1} > c_{r_2}$,
it is worthwhile to defer the purchasing decision, since more information is available at stage $r_2$
when the price is lower. Similarly, for two selling dispatch stages $r_1, r_2 \in \mathcal{R}$ with
$r_1 < r_2$, we require $c_{r_1} > c_{r_2}$. Finally, to avoid arbitrage, for each buying stage $r_1$
and selling stage $r_2$ such that $r_1 < r_2$, we require $c_{r_1} > c_{r_2}$.
At each dispatch stage $r \in \mathcal{R}$, three events occur. First, the information $Y_r$ is
observed. In addition to the state at stage $r$, i.e., $x_r \in Y_r$, the information set may also
contain signals that help predict the demand and wind generation. Examples include weather
forecasts and sensor measurement data that are available at the time of stage $r$. Notice
$Y_r \subset Y_{r+1}$ for all $r \in \mathcal{R}$. Next, a dispatch decision is made: the operator
decides to purchase ($s_r > 0$) or sell ($s_r < 0$) in the $r$-th market. Lastly, the total amount of
power accumulated so far is computed:

$$x_{r+1} = x_r + s_r, \quad r \in \mathcal{R}. \tag{1}$$

The energy accumulated in the $R$ markets is supplied during a delivery time interval which is
discretized into $T$ stages. Let $\mathcal{T} := \{R+1, R+2, \dots, R+T\}$ and use $t$ to denote each
element in $\mathcal{T}$. For each stage in the delivery time interval, the amount of energy supplied
from conventional generation is $x = x_{R+1}/T$.

A random wind generation $W_t$ and a random load $L_t$ are realized for each $t \in \mathcal{T}$. Let
$D_t := L_t - W_t$ denote the net deficit at stage $t$. The deficit may be positive or negative. A
monetary penalty is assessed to compensate for the positive deficit, or unmet demand. Different
forms of the penalty are discussed later in the section. The information set is extended to
stage $R+T$, i.e., for all $i \in \{2, \dots, R, R+1, \dots, R+T\}$ we have $Y_{i-1} \subset Y_i$. We
also define $Y_{R+T+1}$ as the information available at the end of the entire period, with
$Y_{R+T} \subset Y_{R+T+1}$.

In each stage $t \in \mathcal{T}$, the storage operator can recharge $[u_t]_+$ or discharge
$-[u_t]_-$ units of energy subject to physical constraints of the storage device, where
$[u_t]_+ = \max(u_t, 0)$, $[u_t]_- = \min(u_t, 0)$, and $u_t = [u_t]_+ + [u_t]_-$ is the variable
representing the operation for storage at stage $t$. We denote the amount of energy stored in the
storage device at stage $t$ as $b_t$, the transition function for the storage device as
$F(b_t, u_t)$, and the feasible set for the discharging/recharging operations as $\mathcal{U}(b_t)$.
We denote the terminal cost as $g(\mathbf{D}, \mathbf{u}, x)$, which will be specified in
Subsection II-C, where $\mathbf{D} = (D_{R+1}, D_{R+2}, \dots, D_{R+T})$ and
$\mathbf{u} = (u_{R+1}, u_{R+2}, \dots, u_{R+T})$. A control policy
$\phi = (\phi_1, \phi_2, \dots, \phi_{R+T})$ is a sequence of functions, each of which maps the
information available at the current stage to the action at the same stage, i.e., $\phi_i$ is a
$Y_i$-adapted function for every $i \in \mathcal{R} \cup \mathcal{T}$. We use
$\phi^{\mathcal{R}} := (\phi_1, \phi_2, \dots, \phi_R)$ to represent the dispatch policy
and $\phi^{\mathcal{T}} := (\phi_{R+1}, \phi_{R+2}, \dots, \phi_{R+T})$ to represent the storage
operation policy.

The RLD with storage problem can then be summarized as

$$\underset{\phi}{\text{minimize}} \quad E\Big[\sum_{r \in \mathcal{R}} c_r s_r + g(\mathbf{D}, \mathbf{u}, x)\Big] \tag{2a}$$
$$\text{subject to} \quad s_r = \phi_r(Y_r), \tag{2b}$$
$$x_{r+1} = x_r + s_r, \quad r \in \mathcal{R}, \tag{2c}$$
$$u_t = \phi_t(Y_t), \tag{2d}$$
$$u_t \in \mathcal{U}(b_t), \tag{2e}$$
$$b_{t+1} = F(b_t, u_t), \quad t \in \mathcal{T}. \tag{2f}$$

B. Storage

The storage device has finite energy capacity $B$. Energy loss between stages is modeled using
the storage efficiency $\lambda \in [0, 1]$, so the contribution of $b_t$ units of stored energy at
time $t+1$ is only $\lambda b_t$. Energy conversion loss is modeled with parameters
$\mu \in [0, 1]$ and $\nu \in [0, 1]$, denoting the recharging efficiency and discharging efficiency,
respectively. If $u$ units of electric energy are recharged into the storage device, only
$u' = \mu u$ units of energy are actually stored due to energy conversion loss. Similarly, if $u'$
units of energy are discharged from the storage device, only $u = \nu u'$ units can be used to meet
the realized net deficit.

Typically, storage models also consider ramping-rate constraints on charging and discharging
[23]. Fast-response grid-level storage is rapidly becoming available with power-to-energy ratios
of 40 to 50 kW per kWh utilizing advanced lithium-ion blocks. A full charge or discharge of 1
kWh can be obtained in about 1.2 to 1.5 minutes of response time. If the dispatch discretization
interval considered is larger than this, the ramping constraints will not be active during operation.
Although some of our results can be generalized to cases with charge constraints (e.g., [18], [22],
[24]), focusing on a simpler model reveals deeper insight about the solution. In ongoing work,
we are devising a model for slow storage, which needs to account for the existence of multiple
markets (each with multiple storage operation stages) with different timing constraints.

The dynamics of the storage model are captured by the transition function $F(b_t, u_t)$, which
maps the current storage level and operation to the next level $b_{t+1}$ while accounting for the
efficiencies $\lambda$, $\mu$ and $\nu$, and by the feasible action set $\mathcal{U}(b_t)$, which
keeps the stored energy within $[0, B]$. The feasible set for the vector $\mathbf{u}$ is denoted as
$\mathcal{U}(\mathbf{b})$. We assume that the storage starts empty, i.e., $b_{R+1} = 0$, and that any
energy remaining in the storage after stage $R+T$ is discarded at no cost or benefit, consistent
with operating policies for fast storage.

C. Terminal Costs

Different terminal costs lead to different dispatch goals:

Value of loss load (VOLL). Let

$$g_t(D_t, u_t, x) = c\,[D_t - x + u_t]_+$$

measure the shortfall to meet $D_t$ when $x$ units of power are supplied and $-u_t$ units of energy
are withdrawn from the storage at stage $t$, where $c$ is the cost of a unit shortfall. The total
terminal cost over the delivery time interval is then

$$g(\mathbf{D}, \mathbf{u}, x) = c \sum_{t \in \mathcal{T}} [D_t - x + u_t]_+.$$

Notice $g_t(D_t, u_t, x) = c\,[D_t - x + u_t]_+$ is convex in $x$ for all values of $D_t$ and $u_t$.
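
As a minimal sketch of the VOLL terminal cost, the snippet below evaluates $c \sum_t [D_t - x + u_t]_+$ for a given deficit path and storage schedule. The function name and the numbers are hypothetical and only illustrate the sign convention ($u_t > 0$ recharges, $u_t < 0$ discharges).

```python
def voll_cost(deficits, storage_ops, x, c):
    """Total VOLL penalty c * sum_t [D_t - x + u_t]_+ over the delivery interval."""
    return c * sum(max(d - x + u, 0.0) for d, u in zip(deficits, storage_ops))


# Hypothetical example: three delivery stages, per-stage supply x = 5, unit penalty c = 10.
deficits = [4.0, 7.0, 6.0]
storage_ops = [1.0, -1.0, 0.0]  # store 1 unit of surplus, then release it
with_storage = voll_cost(deficits, storage_ops, x=5.0, c=10.0)
without_storage = voll_cost(deficits, [0.0, 0.0, 0.0], x=5.0, c=10.0)
```

In this example the unit stored at the first stage offsets part of the shortfall at the second stage, so the penalty with storage is lower than without it.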

Loss of load probability (LOLP). LOLP is in general defined as the probability of allocated
supply not meeting the random deficit at the delivery time. For our model with a finite-length
delivery time interval and storage, one way to define LOLP is

$$\prod_{t \in \mathcal{T}} P(D_t + u_t > x \mid Y_t) \le \alpha.$$

Here we use another definition, which induces a simple dispatch rule:

$$P(D_t + u_t > x \mid Y_t) \le \alpha_t, \quad \forall t \in \mathcal{T},$$

where $\alpha_t$ is the allowed LOLP at stage $t \in \mathcal{T}$, which could be related to the
allowed LOLP $\alpha$ for the entire delivery time interval with, e.g., $\alpha_t = \alpha^{1/T}$.

A direct setup that achieves this goal is to use extended function definitions:

$$g(\mathbf{D}, \mathbf{u}, x) = \begin{cases} 0 & \text{if } P(D_t + u_t > x \mid Y_t) \le \alpha_t,\ \forall t \in \mathcal{T}, \\ \infty & \text{otherwise}. \end{cases}$$

Notice that since the set

$$\{(\mathbf{u}, x) \mid P(D_t + u_t > x \mid Y_t) \le \alpha_t,\ \forall t \in \mathcal{T}\}$$

is convex, $g(\mathbf{D}, \mathbf{u}, x)$ is convex in $(\mathbf{u}, x)$.
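
The per-stage LOLP rule can be illustrated with a short sketch. It assumes, purely for illustration, a Gaussian net deficit at each stage and no storage action ($u_t = 0$); the function names and numbers are hypothetical.

```python
from statistics import NormalDist


def per_stage_lolp(alpha, T):
    """Per-stage allowance alpha_t = alpha**(1/T), so the product over T stages equals alpha."""
    return alpha ** (1.0 / T)


def min_supply_no_storage(mean, sigma, alpha_t):
    """Smallest x with P(D_t > x) <= alpha_t when D_t ~ N(mean, sigma^2) and u_t = 0."""
    return mean + sigma * NormalDist().inv_cdf(1.0 - alpha_t)


alpha_t = per_stage_lolp(alpha=0.01, T=2)
x_min = min_supply_no_storage(mean=50.0, sigma=5.0, alpha_t=alpha_t)
tail = 1.0 - NormalDist(50.0, 5.0).cdf(x_min)  # realized per-stage LOLP at x_min
```

Under these assumptions the per-stage constraint reduces to a quantile of the forecast distribution, which is what makes this definition convenient for dispatch.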

Frequency drop charge. In some scenarios, it is desirable to charge for the frequency deviations
caused by unmet demand or excessive generation. A common assumption, valid for small
deviations of net demand, is that the frequency deviation is linearly related to unmet demand,
i.e., $\Delta f = \alpha (D - x_{R+1})$. If the deviation costs \$$c$ per MW, then for a stage $t$

$$g_t(D_t, u_t, x) = c\alpha\,|D_t - x + u_t|.$$

In this case $g$ is also convex, but is not non-increasing. For all stages in the delivery time
interval, we have

$$g(\mathbf{D}, \mathbf{u}, x) = c\alpha \sum_{t \in \mathcal{T}} \big([D_t - x + u_t]_+ - [D_t - x + u_t]_-\big).$$

III. Structural Result for Dispatch Control 

The structure of the optimal dispatch depends on determining the optimal generation scheduling
assuming that storage is operated optimally given a generator schedule. Based on the latter
assumption, a standard dynamic programming result reveals:

Lemma III.1. The cost-to-go function for a dispatch stage $r \in \mathcal{R} \cup \{R+1\}$ is

$$J_{R+1}(x_{R+1}) = \inf_{\mathbf{u} \in \mathcal{U}(\mathbf{b})} E\{g(\mathbf{D}, \mathbf{u}, x) \mid Y_{R+1}\},$$

$$J_r(x_r) = \inf_{s_r \in \mathcal{S}_r} E\{c_r s_r + J_{r+1}(x_{r+1}) \mid Y_r\}, \quad r \in \mathcal{R}, \tag{3}$$

where $\mathcal{S}_r = \{s \mid s \ge 0\}$ if $r$ is a buying stage and
$\mathcal{S}_r = \{s \mid s \le 0\}$ if $r$ is a selling stage. Further, if
$s_r^* = \phi_r^*(Y_r)$ minimizes the right-hand side of (3) for each $x_r$ and $r$, then the dispatch
policy $\phi^{\mathcal{R}*} = (\phi_1^*, \dots, \phi_R^*)$ is optimal.

Lemma III.1 states that the policy minimizing the cost-to-go functions is optimal. Note the
per-stage terminal cost function $g_t(D_t, u_t, x)$ is jointly convex in $u_t$ and $x$. It follows
that the cost-to-go function for each stage $r \in \mathcal{R} \cup \{R+1\}$ is convex:

Proposition III.2. The cost-to-go function $J_r(x_r)$, for all $r \in \mathcal{R} \cup \{R+1\}$, is
convex in $x_r$ given $g(\mathbf{D}, \mathbf{u}, x)$ convex in $\mathbf{u}$ and $x$.

Remark III.3. Proposition III.2 is not restricted to the case of constant prices for dispatch.
In fact, the convexity of the cost-to-go extends to the case where the price can depend on the
dispatch, i.e., $c_r = c_r(s_r)$, as long as $c_r(\cdot)$ is a convex function for all
$r \in \mathcal{R}$.

Based on these observations and principles from inventory control, the structure of the optimal
control can be computed. This structure depends on the gradient of the cost function. When cost
functions are not differentiable (with respect to $x_r$), we use the notion of constrained
subgradient, denoted as $\nabla$ in the sequel. Relying on Lemma III.1 and Proposition III.2, we are
ready to give the main result of this section:

Theorem III.4. For each dispatch stage $r \in \mathcal{R}$, the optimal dispatch is

$$s_r^* = \begin{cases} [\psi_r - x_r]_+ & \text{if } r \text{ is a buying stage}, \\ [\psi_r - x_r]_- & \text{if } r \text{ is a selling stage}, \end{cases} \tag{4}$$

where $\psi_r$ is a state-independent variable that satisfies

$$c_r + \nabla \bar{J}_{r+1}(\psi_r) = 0,$$

with $\bar{J}_{r+1}(x) = E[J_{r+1}(x) \mid Y_r]$. Thus $\psi_r$ is uniquely defined as
$\psi_r = \nabla \bar{J}_{r+1}^{-1}(-c_r)$.

Theorem III.4 shows that the optimal dispatch is characterized by a sequence of thresholds $\psi_r$
for $r \in \mathcal{R}$. This important practical feature of RLD [25] generalizes to the case of
storage with convex terminal costs. However, in RLD without storage the thresholds can be
precomputed given the probability structure of the net demand $D$ conditional on the information set
$Y_r$. In the present case this is not possible in general, as the net demand follows a stochastic
process $D_t$, $t \in \mathcal{T}$, instead of a single random variable representing the total net
demand in the period. Moreover, the computation of the constrained subgradient of the terminal cost
function, coupled with minimization over feasible storage operations, may not be analytically
tractable.

IV. Storage Operation and Threshold Computation

The threshold structure derived in Section III is valid for any choice of convex terminal
cost function, but the actual threshold computation depends on the particular cost choice. We
focus on the VOLL cost in the remainder of the paper, but the analysis can be generalized to
the other costs in Section II-C. For the VOLL cost, the optimal control problem is

$$\underset{\phi}{\text{minimize}} \quad E\Big[\sum_{r \in \mathcal{R}} c_r s_r + c \sum_{t \in \mathcal{T}} [D_t - x + u_t]_+\Big]$$
$$\text{subject to} \quad \text{(2b)-(2f)}.$$

Section III solved this problem assuming storage is operated optimally. The remainder of the
section derives a more explicit optimal control rule for storage under the VOLL cost. Based on
it, an efficient algorithm for the constrained subgradient of the terminal cost-to-go function is
developed, which simplifies the computation of the dispatch thresholds significantly.
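
To make the threshold condition $c_r + \nabla \bar{J}_{r+1}(\psi_r) = 0$ of Theorem III.4 concrete, the sketch below solves it by bisection in the simplest setting: a single delivery stage ($T = 1$), no storage, and a Gaussian net deficit, so that $\nabla \bar{J}(x) = -c\,P(D > x)$. These simplifications and all names are assumptions made for illustration; with storage, the subgradient has to be evaluated as described in the remainder of this section.

```python
from statistics import NormalDist


def grad_cost_to_go(x, mean, sigma, c):
    """Gradient of c * E[D - x]_+ for D ~ N(mean, sigma^2): -c * P(D > x)."""
    return -c * (1.0 - NormalDist(mean, sigma).cdf(x))


def threshold(c_r, mean, sigma, c, tol=1e-10):
    """Bisection on c_r + grad(x) = 0; the left side is nondecreasing in x by convexity."""
    lo, hi = mean - 10.0 * sigma, mean + 10.0 * sigma
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if c_r + grad_cost_to_go(mid, mean, sigma, c) < 0.0:
            lo = mid  # marginal penalty saving still exceeds the price: buy more
        else:
            hi = mid
    return 0.5 * (lo + hi)


def dispatch(psi, x_r, buying=True):
    """Threshold rule: [psi - x_r]_+ at a buying stage, [psi - x_r]_- at a selling stage."""
    gap = psi - x_r
    return max(gap, 0.0) if buying else min(gap, 0.0)


# Hypothetical numbers: price c_r = 2, penalty c = 10, forecast N(100, 10^2).
psi = threshold(c_r=2.0, mean=100.0, sigma=10.0, c=10.0)
closed_form = 100.0 + 10.0 * NormalDist().inv_cdf(1.0 - 2.0 / 10.0)
```

In this special case the threshold is the $(1 - c_r/c)$ quantile of the forecast distribution, which the bisection reproduces.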

A. Storage Control 

The optimal storage operation problem, given $x_{R+1}$ units of energy accumulated in the $R$
markets, is

$$\text{minimize} \quad E\Big[c \sum_{t \in \mathcal{T}} [D_t - x + u_t]_+\Big]$$
$$\text{subject to} \quad \text{(2d)-(2f)}.$$

The storage operation subproblem is again solved with dynamic programming.

Lemma IV.1 (Optimal storage operation). The terminal cost-to-go function is

$$J_{R+T+1}(x, b_{R+T+1}) = 0,$$

and the cost-to-go for a storage operation stage $t \in \mathcal{T}$ is

$$J_t(x, b_t) = \inf_{u_t \in \mathcal{U}(b_t)} E\{c\,[D_t - x + u_t]_+ + J_{t+1}(x, b_{t+1}) \mid Y_t\}.$$

The cost-to-go function for each $t \in \mathcal{T} \cup \{R+T+1\}$ is convex in $(x, b_t)$. For
$t \in \mathcal{T}$, the optimal control policy for the storage operation is

$$u_t^*(b_t) = \min\Big\{[x - D_t]_+,\ \tfrac{1}{\mu}(B - b_t)\Big\} - \min\big\{[D_t - x]_+,\ \nu b_t\big\},$$

or equivalently, in terms of recharging and discharging,

$$[u_t^*(b_t)]_+ = \min\Big\{[x - D_t]_+,\ \tfrac{1}{\mu}(B - b_t)\Big\}, \qquad [u_t^*(b_t)]_- = -\min\big\{[D_t - x]_+,\ \nu b_t\big\}.$$
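
The policy of Lemma IV.1 is greedy: store any surplus up to the free capacity, and discharge to cover any shortfall up to the stored energy. A minimal sketch for ideal storage ($\lambda = \mu = \nu = 1$) follows; the function names and the sample path are hypothetical.

```python
def optimal_storage_action(deficit, x, b, B):
    """Greedy action for ideal storage: u > 0 recharges, u < 0 discharges."""
    charge = min(max(x - deficit, 0.0), B - b)   # store surplus, limited by free capacity
    discharge = min(max(deficit - x, 0.0), b)    # cover shortfall, limited by stored energy
    return charge - discharge


def simulate(deficits, x, B, b0=0.0):
    """Run the greedy policy along a deficit path; return per-stage shortfalls and levels."""
    b, shortfalls, levels = b0, [], []
    for d in deficits:
        u = optimal_storage_action(d, x, b, B)
        shortfalls.append(max(d - x + u, 0.0))
        b += u
        levels.append(b)
    return shortfalls, levels


# Hypothetical path: per-stage supply x = 5, capacity B = 2, storage initially empty.
path = [3.0, 8.0, 9.0, 2.0]
shortfalls, levels = simulate(path, x=5.0, B=2.0)
```

For ideal storage the resulting level satisfies $b_{t+1} = \min(B, [x - D_t + b_t]_+)$, and the shortfall at stage $t$ equals $[D_t - x - b_t]_+$.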

B. Threshold Computation 

The thresholds can be computed by combining analytic and algorithmic approaches. Without loss
of generality, we focus on the case of ideal storage ($\lambda = \mu = \nu = 1$) to simplify the
notation. First, a simple consequence of Lemma IV.1 gives a recursive formula for the expected total
cost over all the storage operation stages (i.e., the terminal cost-to-go for dispatch after
generators are scheduled):

Corollary IV.2 (Terminal cost-to-go). The expected cost-to-go function at stage $R+1$ given the
information at a dispatch stage $r$ is

$$\bar{J}_{R+1}(x_{R+1}) = c\,E\Big[\sum_{t \in \mathcal{T}} [D_t - x - b_t^*]_+ \,\Big|\, Y_r\Big], \tag{5}$$

where $x = x_{R+1}/T$ and the optimal storage level $b_t^*$ is defined recursively as

$$b_{t+1}^* = F(b_t^*, u_t^*) = B \wedge [x - D_t + b_t^*]_+, \tag{6}$$

with $b_{R+1}^* = b_{R+1} = 0$ and $a \wedge b = \min(a, b)$, for $t \in \mathcal{T}$.

Eq. (5) can be combined with prior results in [25] to obtain the cost-to-go of the other dispatch
stages due to linearity of the cost structure. Also notice that (5) gives a Monte Carlo based
algorithm to evaluate the dispatch thresholds. In the remainder of this section, a closer
investigation of the terminal cost-to-go and its subgradient is presented to devise a more efficient
algorithm for the threshold evaluation.

For this purpose, we first classify the states of the storage into three cases: empty
($b_t = 0$), full ($b_t = B$) and strictly in between ($0 < b_t < B$). The proposed method then works
by calculating the probability of all possible sequences of states of the storage device. As an
illustration, Fig. 2 depicts the tree of possible storage states for the case $T = 3$. Levels of the
tree correspond to the storage operation stages, and nodes of the tree represent the state of the
storage device. Note the probability of visiting each node in the tree can be easily computed
analytically. However, this does not lead to a practical algorithm due to the curse of
dimensionality, because the number of nodes in each level of the tree grows exponentially as $t$
increases. In order to obtain a polynomial-time algorithm, we introduce the algebraic recombinant
lattice, an algebraic analog to the recombinant (or recombining) lattice, a technique widely used in
finance applications [26] and introduced to the power systems and control community by, e.g., [27].
In a recombinant lattice model, the lattice (i.e., the discretized state space) of dynamic
programming combines lattice points whenever two lattice points represent numeric values that are
close enough, so that the growth of the state space is linear. In a similar spirit, we combine the
lattice points based on the algebraic forms in (6), or equivalently the algebraic form of the
effective deficit, which is defined to be the difference between the realized deficit and the
storage level for a particular case (node in Fig. 2).

Fig. 2: Scenario tree for storage operation: $T = 3$ example. The nodes depict the state of the
storage device. A filled, half-filled, and unfilled node represents the case where the storage is
full, between full and empty, and empty after the optimal control at the corresponding stage,
respectively. The tree grows exponentially.

Proposition IV.3 (State space decomposition). At stage $t \in \mathcal{T}$, there are
$K_t = 2(t - R) - 1$ algebraic forms of effective deficit $D_t^k$, which are defined recursively as

$$D_t^k = \begin{cases} D_t & \text{if } k = 1, \\ D_t + D_{t-1}^{k-1} - x & \text{if } k \in \mathcal{K}_t \setminus \{1, K_t\}, \\ D_t - B & \text{if } k = K_t, \end{cases}$$

where $k \in \mathcal{K}_t = \{1, \dots, K_t\}$, or equivalently

$$D_t^k = \begin{cases} \sum_{i=0}^{k-1} D_{t-i} - (k-1)x & \text{if } 1 \le k \le t - R, \\ \sum_{i=0}^{K_t - k} D_{t-i} - (K_t - k)x - B & \text{if } t - R < k \le K_t. \end{cases}$$

The indicator of the event for each particular case to happen is

$$p_t^k = \begin{cases} \sum_{i \in \mathcal{K}_{t-1}} p_{t-1}^i\, 1\{D_{t-1}^i \ge x\} & \text{if } k = 1, \\ p_{t-1}^{k-1}\, 1\{x - B < D_{t-1}^{k-1} < x\} & \text{if } k \in \mathcal{K}_t \setminus \{1, K_t\}, \\ \sum_{i \in \mathcal{K}_{t-1}} p_{t-1}^i\, 1\{D_{t-1}^i \le x - B\} & \text{if } k = K_t, \end{cases}$$

with $p_{R+1}^1 = 1$. The probability of each of these events conditioned on the information available
at a dispatch stage $r$ is $E[p_t^k \mid Y_r]$.

The recombinant lattice based on the algebraic form of the effective deficit $D_t^k$ is depicted in
Fig. 3. We also refer to the $k$-th node on the $t$-th level (corresponding to $D_t^k$ in the figure)
as node $(t, k)$. Notice that only the nodes originally corresponding to cases where the storage is
empty or full are combined, since they share the same expression of $D_t^k$. Now the number of nodes
in each level grows linearly as a function of $t$, as desired. Furthermore, the recursive definition
of the indicator of visiting each node characterizes the condition for each case. The expected
terminal cost-to-go and its subgradient with respect to $x$ can then be expressed in terms of the
effective deficits $D_t^k$ and indicators $p_t^k$.

Fig. 3: Evolution of effective deficit $D_t^k$: $T = 3$ illustration. For each node, its left and
right children represent the cases where, starting with the effective deficit $D_t^k$, the storage
ends up empty and full after optimal storage operation at stage $t + 1$, respectively. The middle
child represents the case that the storage level is strictly between 0 and $B$.
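
The recursion of Proposition IV.3 can be checked along a single sample path. The sketch below advances the lattice stage by stage, tracking for each node its indicator and effective deficit, and compares the resulting per-stage shortfall $\sum_k p_t^k [D_t^k - x]_+$ with a direct simulation of ideal storage. It assumes $B > 0$ and uses integer data so that comparisons are exact; all names are hypothetical.

```python
import random


def lattice_step(nodes, d_t, x, B):
    """Advance one stage. `nodes` maps k -> (indicator p, effective deficit D^k)."""
    K_prev = len(nodes)
    empty = sum(p for p, dk in nodes.values() if dk >= x)      # storage ends up empty
    full = sum(p for p, dk in nodes.values() if dk <= x - B)   # storage ends up full
    new = {1: (empty, d_t)}
    for k in range(2, K_prev + 2):                             # middle nodes
        p, dk = nodes[k - 1]
        new[k] = (p if x - B < dk < x else 0, d_t + dk - x)
    new[K_prev + 2] = (full, d_t - B)
    return new


def lattice_costs(deficits, x, B):
    """Per-stage shortfall sum_k p^k [D^k - x]_+ computed on the lattice."""
    nodes = {1: (1, deficits[0])}
    out = [sum(p * max(dk - x, 0) for p, dk in nodes.values())]
    for d in deficits[1:]:
        nodes = lattice_step(nodes, d, x, B)
        out.append(sum(p * max(dk - x, 0) for p, dk in nodes.values()))
    return out


def direct_costs(deficits, x, B):
    """Per-stage shortfall [D_t - x - b_t]_+ with b_{t+1} = min(B, [x - D_t + b_t]_+)."""
    b, out = 0, []
    for d in deficits:
        out.append(max(d - x - b, 0))
        b = min(B, max(x - d + b, 0))
    return out


def self_check(trials=200, seed=0):
    """Compare both computations on random integer paths (B >= 1)."""
    rng = random.Random(seed)
    for _ in range(trials):
        deficits = [rng.randint(0, 10) for _ in range(rng.randint(1, 6))]
        x, B = rng.randint(2, 8), rng.randint(1, 4)
        if lattice_costs(deficits, x, B) != direct_costs(deficits, x, B):
            return False
    return True
```

Each call to `lattice_step` adds two nodes, so the level at stage $t$ has $2(t - R) - 1$ nodes, in contrast with the exponentially growing scenario tree.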



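Theorem IV.4 (Terminal cost-to-go and subgradient). The expected cost-to-go function at stage
$R+1$ given the information at a dispatch stage $r$ is

$$\bar{J}_{R+1}(x_{R+1}) = c\,E\Big[\sum_{t \in \mathcal{T}} \sum_{k \in \mathcal{K}_t} p_t^k\,[D_t^k - x]_+ \,\Big|\, Y_r\Big] = c\,E\Big[\sum_{t \in \mathcal{T}} \sum_{k \in \mathcal{K}_t} \sum_{j \in \mathcal{N}_t^k} (D_t^k - x)\, 1\big\{(\underline{\mathbf{D}}_t^k)_j < \mathbf{D}_t < (\overline{\mathbf{D}}_t^k)_j\big\} \,\Big|\, Y_r\Big], \tag{7}$$

where $\mathbf{D}_t = (D_{R+1}^1, D_{R+2}^1, D_{R+2}^2, D_{R+2}^3, \dots, D_t^1, \dots, D_t^{K_t})^\top$, and
$(\underline{\mathbf{D}}_t^k)_j$ and $(\overline{\mathbf{D}}_t^k)_j$, the $j$-th columns of the matrices
$\underline{\mathbf{D}}_t^k$ and $\overline{\mathbf{D}}_t^k$ (one column per element of
$\mathcal{N}_t^k$), are chosen such that
$1\{D_t^k > x\}\,p_t^k = \sum_{j \in \mathcal{N}_t^k} 1\{(\underline{\mathbf{D}}_t^k)_j < \mathbf{D}_t < (\overline{\mathbf{D}}_t^k)_j\}$,
each element in $\mathcal{N}_t^k$ representing a path from the root node to node $(t, k)$ in the
lattice.

The constrained subgradient for the cost-to-go function at stage $R+1$ given the information
at a dispatch stage $r$ consists of two groups of terms. The first group is

$$-\frac{c}{T} \sum_{t \in \mathcal{T}} \sum_{k \in \mathcal{K}_t} \sum_{j \in \mathcal{N}_t^k} [k \wedge (K_t + 1 - k)]\, E\Big[1\big\{(\underline{\mathbf{D}}_t^k)_j < \mathbf{D}_t < (\overline{\mathbf{D}}_t^k)_j\big\} \,\Big|\, Y_r\Big],$$

which accounts for the dependence of $D_t^k - x$ on $x$. The second group collects, for every entry
$i$ of $\mathbf{D}_t$, the derivatives with respect to $x$ of the bounds, evaluated on the events
$1\{(\mathbf{D}_t)_i = (\underline{\mathbf{D}}_t^k)_{i,j}\}$ and
$1\{(\mathbf{D}_t)_i = (\overline{\mathbf{D}}_t^k)_{i,j}\}$ jointly with
$1\{(\underline{\mathbf{D}}_t^k)_{-i,j} < (\mathbf{D}_t)_{-i} < (\overline{\mathbf{D}}_t^k)_{-i,j}\}$.
Here $(\mathbf{D}_t)_i$, $(\underline{\mathbf{D}}_t^k)_{i,j}$ and $(\overline{\mathbf{D}}_t^k)_{i,j}$ are
the $i$-th entries of the column vectors $\mathbf{D}_t$, $(\underline{\mathbf{D}}_t^k)_j$ and
$(\overline{\mathbf{D}}_t^k)_j$, respectively; $(\mathbf{D}_t)_{-i}$,
$(\underline{\mathbf{D}}_t^k)_{-i,j}$ and $(\overline{\mathbf{D}}_t^k)_{-i,j}$ are the remaining parts of
the corresponding vectors.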



C. Gaussian Dispatch 

Theorem IV.4 works for general deficit processes. In practice, information about the net load
is given by forecasts of load and wind and the expected variance of these forecasts. Utilizing
the predicted deficit, the dispatch can be simplified by considering the forecast error random
variable of the net load. The prediction errors of the deficit process can be assumed to be
Gaussian random variables, as observed in various studies (e.g., [22], [25], [28]). In particular,
we consider the following form of the forecast:

$$D_t = \hat{D}_t(Y_r) + \epsilon_t(Y_r), \quad \forall t \in \mathcal{T}, \tag{8}$$

where $\epsilon_t(Y_r) \sim N(0, \sigma_t^2(Y_r))$ is independently distributed for each $t$. For each
dispatch stage $r \in \mathcal{R}$, the forecast $\hat{D}_t(Y_r)$ and forecast error
$\epsilon_t(Y_r)$ depend on the information $Y_r$. A typical pattern of this dependence is that as the
delivery time approaches, the variance of the prediction error decreases due to the accumulated
information that is collected by, e.g., wind speed sensors around the wind farm. This dependence is
captured by inputting different variance values for the prediction error at different dispatch
stages. Note that for each fixed $t \in \mathcal{T}$, $\epsilon_t(Y_r)$ may not be independent across
dispatch stages (for different values of the index $r$). Since the calculation in the remainder of
this paper applies to each dispatch stage, we write $\epsilon_t$ and $\hat{D}_t$ directly when the
index $r$ is clear from the context, and omit the dependence on $Y_r$ to simplify the notation.

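The independence of prediction errors simplifies the evaluation of the probability of visiting
each node. We now give the updated version of Proposition IV.3.

Proposition IV.5 (State space decomposition: Gaussian errors). With forecast model (8), the
predicted effective deficit $\hat{D}_t^k$ is

$$\hat{D}_t^k = \begin{cases} \sum_{i=0}^{k-1} \hat{D}_{t-i} - (k-1)x & \text{if } 1 \le k \le t - R, \\ \sum_{i=0}^{K_t - k} \hat{D}_{t-i} - (K_t - k)x - B & \text{if } t - R < k \le K_t, \end{cases} \tag{9}$$

and the prediction error in the effective deficit is

$$\epsilon_t^k = \sum_{j=0}^{h} \epsilon_{t-j}, \tag{10}$$

where $h = (k - 1) \wedge (K_t - k)$ is the depth of the node $(t, k)$ from the closest boundary node.

For the state corresponding to node $(t, k)$ in the algebraic recombinant lattice, denote the
probability to visit the node as $p_t^k$, to visit the node and move to its left child as
$p_t^{k,\mathrm{L}}$, to visit the node and move to its middle child as $p_t^{k,\mathrm{M}}$, and to
visit the node and move to its right child as $p_t^{k,\mathrm{R}}$. The following recursion holds for
these quantities. Starting with $p_{R+1}^1 = 1$,

$$p_t^k = \begin{cases} \sum_{i \in \mathcal{K}_{t-1}} p_{t-1}^{i,\mathrm{L}} & \text{if } k = 1, \\ p_{t-1}^{k-1,\mathrm{M}} & \text{if } k \in \mathcal{K}_t \setminus \{1, K_t\}, \\ \sum_{i \in \mathcal{K}_{t-1}} p_{t-1}^{i,\mathrm{R}} & \text{if } k = K_t, \end{cases} \tag{11}$$

$$p_t^{k,\mathrm{L}} = p_{t-h}^{k-h}\, P\big(\underline{\mathbf{E}}_{t-1}^k < \mathbf{E}_{t-1}^k < \overline{\mathbf{E}}_{t-1}^k,\ \epsilon_t^k > \overline{e}_t^k\big), \tag{12}$$

$$p_t^{k,\mathrm{M}} = p_{t-h}^{k-h}\, P\big(\underline{\mathbf{E}}_t^k < \mathbf{E}_t^k < \overline{\mathbf{E}}_t^k\big), \tag{13}$$

$$p_t^{k,\mathrm{R}} = p_{t-h}^{k-h}\, P\big(\underline{\mathbf{E}}_{t-1}^k < \mathbf{E}_{t-1}^k < \overline{\mathbf{E}}_{t-1}^k,\ \epsilon_t^k < \underline{e}_t^k\big), \tag{14}$$

where $\underline{e}_t^k = x - B - \hat{D}_t^k$, $\overline{e}_t^k = x - \hat{D}_t^k$,
$\mathbf{E}_t^k = (\epsilon_{t-h}^{k-h}, \epsilon_{t-h+1}^{k-h+1}, \dots, \epsilon_t^k)^\top$,
$\underline{\mathbf{E}}_t^k = (\underline{e}_{t-h}^{k-h}, \underline{e}_{t-h+1}^{k-h+1}, \dots, \underline{e}_t^k)^\top$
and
$\overline{\mathbf{E}}_t^k = (\overline{e}_{t-h}^{k-h}, \overline{e}_{t-h+1}^{k-h+1}, \dots, \overline{e}_t^k)^\top$;
the vectors $\mathbf{E}_{t-1}^k$, $\underline{\mathbf{E}}_{t-1}^k$ and $\overline{\mathbf{E}}_{t-1}^k$ are
the same vectors without their last entry.

Given that $\epsilon_t$ is independently distributed zero-mean Gaussian, $\epsilon_t^k$ is also
zero-mean Gaussian, whose variance can be easily computed. It follows that $\mathbf{E}_t^k$ is a
zero-mean multivariate Gaussian, and its distribution function can be evaluated given its
covariance matrix, which again is available from the definition of $\epsilon_t^k$. Proposition IV.5
allows us to evaluate the expected terminal cost-to-go and its subgradient explicitly.

Lemma IV.6 (Terminal cost-to-go and subgradient: Gaussian errors). The expected terminal cost
is

$$\bar{J}_{R+1}(x_{R+1}) = c \sum_{t \in \mathcal{T}} \sum_{k \in \mathcal{K}_t} p_t^{k,\mathrm{L}} \Big(\hat{D}_t^k - x + \Big[\mu\Big(\mathbf{E}_t^k;\ (\underline{\mathbf{E}}_{t-1}^k, \overline{e}_t^k),\ (\overline{\mathbf{E}}_{t-1}^k, \infty)\Big)\Big]_{h+1}\Big),$$

where $\mu(X; \underline{X}, \overline{X})$ is the mean vector of the truncated Gaussian with mean and
variance equal to that of $X$ and truncation interval $[\underline{X}, \overline{X})$. Here the second
term in the bracket is the mean of the last entry of $\mathbf{E}_t^k$ within the corresponding
interval.

The expected constrained subgradient is

$$\nabla \bar{J}_{R+1}(x_{R+1}) = -\frac{c}{T} \sum_{t \in \mathcal{T}} \sum_{k \in \mathcal{K}_t} [k \wedge (K_t + 1 - k)]\, p_t^{k,\mathrm{L}}. \tag{15}$$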
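
The depth $h$ and the variance of the effective prediction error in Proposition IV.5 are simple to tabulate, since the effective error at node $(t, k)$ is the sum of the $h + 1$ most recent independent errors. The sketch below does this for one lattice level; the function names and the indexing convention (`stage` $= t - R$, and `sigmas[i]` the standard deviation at delivery stage $i + 1$) are assumptions made for illustration.

```python
def depth(k, stage):
    """Depth h = min(k - 1, K_t - k) of node (t, k), with stage = t - R and K_t = 2*stage - 1."""
    K = 2 * stage - 1
    if not 1 <= k <= K:
        raise ValueError("node index k must lie in 1..K_t")
    return min(k - 1, K - k)


def effective_error_variance(k, stage, sigmas):
    """Variance of eps_t^k = sum_{j=0}^{h} eps_{t-j} for independent zero-mean errors."""
    h = depth(k, stage)
    return sum(sigmas[stage - 1 - j] ** 2 for j in range(h + 1))


# Hypothetical standard deviations for the first three delivery stages.
sigmas = [1.0, 2.0, 3.0]
variances = [effective_error_variance(k, 3, sigmas) for k in range(1, 6)]
```

The variances are symmetric around the central node, which has the largest depth and therefore accumulates the most forecast error.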

Relying only on the evaluation of the Gaussian distribution function, Proposition IV.5 and
Lemma IV.6 give an analytically tractable approach to calculate the expected total cost for the
delivery interval with storage operation. This provides an efficient approach to compute the
dispatch thresholds for prediction model (8). We also note this result is of interest in other
applications of energy storage, where the benefits of storage need to be quantified as cost savings
over a finite horizon. We can in fact analyze several practical special cases of model (8) and point
out cases where the thresholds can be computed off-line:

1) $\sigma_t = \sigma$. In this case, the prediction errors are i.i.d. Gaussian random variables.
This type of prediction model is typically favored by power engineers because there may not be
enough data to estimate different variances for the prediction error at different storage
operation stages. Further, this results in a simpler implementation of the calculation. Note that
in terms of the analytic derivation, this assumption does not lead to further simplification.

2) $\hat{D}_t = \hat{D}$. In this case, the only forecast available is one nominal value for the
deficit over the entire delivery time interval. This simplifies the form of $\hat{D}_t^k$ in
Equation (9). From a practical perspective, this assumption allows the computation of the thresholds
to be conducted off-line. This simplifies the dispatch procedure tremendously, so that stochastic
dispatch may be carried out in a similar fashion to conventional deterministic dispatch.

3) $\hat{D}_t = \hat{D}$ and $\sigma_t = \sigma$. This is the simplest model in which thresholds can
be computed off-line. It also requires very little data to estimate the model parameters. However,
this model may be too simple to represent the fluctuation of the deficit process over the delivery
time interval.

V. Approximate algorithm for dispatch 

In this section, we consider the continuous-time operation of energy storage and propose an 
approximate algorithm for estimating dispatch thresholds. Before introducing the continuous- 
time model, we first reformulate the discrete-time counterpart. Without loss of generality, we 
assume c = 1. Let 

t 

Vt= ^ [Dr - X + Ur] + , 

T = R+1 

t 

T = R+1 

denote the cumulative VOLL cost and cumulative curtailment up to time t, respectively. Suppose 
that Vt and Qt, t E T, are adapted to information Yt, t E T. Then the charging and discharging 
operation of energy storage is uniquely specified by 

[ (Vt - Vt^,) + (Qt - Qt-i) -{Dt-x) if t > i? + 1, 

Ut= < 

Vr+i + Qr+1 - -x) if t = + 1, 
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and is also adapted to Ft, t G T. The stored energy can also be expressed in terms of Vt and 
Qt-. 

bt+i = bt + ut 

t 

= bn+i + ^ 

T=R+1 

t 

= bn+i - ^ (A - a;) + + Qt- 

T = R+1 

Now we can reformulate the optimal storage operation problem with Vt and Qt as control 
variables, that is, 

minimize E [Vr+^+i] 

t 

subject to bt+i = - ^ {Dr~x) + Vt + Qt, 

T=R+1 

< bt+i < B, 

Vt>Vt-i>...> Vr+i > 0, 
Qt < Qt-i <...< Qr+1 < 0, 

{Vt,Qt) = uyt)- 

Although the feasible set allowing $(V_t - V_{t-1})(Q_t - Q_{t-1}) < 0$ is larger than the feasible set of 
the original problem, it is easy to see that the alternative control variables 

$$\tilde{V}_t = V_t - \min\{V_t - V_{t-1},\, -(Q_t - Q_{t-1})\}, \qquad \tilde{Q}_t = Q_t + \min\{V_t - V_{t-1},\, -(Q_t - Q_{t-1})\},$$

yield the same stored energy $b_{t+1}$ and lower cost. Thus, the reformulated optimization problem 
is equivalent to the original problem. Under the optimal policy in Lemma IV.1, the cumulative 
VOLL cost $V_t$ increases only if storage is empty, that is, $b_{t+1} = 0$, and the cumulative curtailment 
$Q_t$ decreases only if storage is full, that is, $b_{t+1} = B$. 
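
The reformulation can be checked on a sample path. The following is a minimal sketch, assuming a lossless storage with zero initial energy and a greedy operation (discharge to cover a deficit, charge from a surplus); the function name `simulate_storage` and the sample values are illustrative and not part of the model above.

```python
def simulate_storage(deficits, x, capacity, b0=0.0):
    """Greedy storage operation for per-period deficits D_t and generation x.

    Returns a list of (b_{t+1}, V_t, Q_t) tuples, where V_t is the cumulative
    unserved deficit (non-decreasing) and Q_t the cumulative curtailment
    (non-increasing, so Q_t <= 0). Assumes lossless storage (illustrative).
    """
    b, V, Q = b0, 0.0, 0.0
    trace = []
    for d in deficits:
        shortfall = d - x                       # D_t - x
        b_next = min(capacity, max(0.0, b - shortfall))
        u = b_next - b                          # charge (>0) or discharge (<0)
        residual = shortfall + u                # D_t - x + u_t
        V += max(residual, 0.0)                 # positive part: lost load
        Q += min(residual, 0.0)                 # negative part: curtailment
        b = b_next
        trace.append((b, V, Q))
    return trace


if __name__ == "__main__":
    for step in simulate_storage([1.5, 0.25, 0.0, 0.5, 2.0], x=1.0, capacity=1.0):
        print(step)
```

On any path, the sketch satisfies $b_{t+1} = b_{R+1} - \sum_\tau (D_\tau - x) + V_t + Q_t$, with $V_t$ moving only when the storage is empty and $Q_t$ only when it is full.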

With the above reformulation, we are ready to introduce the continuous-time model. Assume 
that the delivery time is a continuous-time interval $\mathcal{T}_c := [R+1, R+T+1]$. Assume that 
given information set $Y_{R+1}$, the cumulative net deficit process $D$ is a $(D(Y_{R+1})/T,\ \sigma_{R+1}/\sqrt{T})$ 
Brownian motion, that is, $D_t$ is a Gaussian random variable with mean $D(Y_{R+1})(t-R-1)/T$ and 
variance $\sigma_{R+1}^2 (t-R-1)/T$, where the adjustment in $t$ is due to the starting time. The cumulative 
VOLL cost $V_t$ is adapted to information $Y_t$, continuous, and non-decreasing with $V_{R+1} = 0$. The 
cumulative curtailment $Q_t$ is adapted to information $Y_t$, continuous, and non-increasing with 
$Q_{R+1} = 0$. Then the stored energy at time $t$ is equal to 

$$b_t = b_{R+1} - [D_t - x(t-R-1)] + V_t + Q_t$$

for $t \in \mathcal{T}_c$. Under the optimal policy, $V_t$ increases only if $b_t = 0$, and $Q_t$ decreases only if 
$b_t = B$. The stored energy process $b_t$ is a reflected Brownian motion. We will approximate the 
total VOLL cost by the product of the long-term average VOLL cost and the delivery interval 
length. To find the long-term average cost, we use the properties of reflected Brownian motion 
in the following Lemma. 

Lemma V.1 ([29]). Let $Z_t$ be a $(\mu, \sigma)$ Brownian motion with $Z_0 = 0$ and 

$$b_t = Z_t + V_t + Q_t$$

be a reflected Brownian motion in $[0, B]$ such that $V_t$ and $Q_t$ are adapted to the filtration induced 
by $Z_t$ and satisfy 

1) $V_t$ is continuous and non-decreasing with $V_0 = 0$, 

2) $V_t$ increases only when $b_t = 0$, 

3) $Q_t$ is continuous and non-increasing with $Q_0 = 0$, and 

4) $Q_t$ decreases only when $b_t = B$. 

The long-term averages of $V_t$ and $Q_t$ are equal to 

$$\lim_{t\to\infty} \frac{\mathbb{E}[V_t]}{t} = \frac{\mu}{e^{2\mu B/\sigma^2} - 1}, \qquad \lim_{t\to\infty} \frac{\mathbb{E}[Q_t]}{t} = -\frac{\mu}{1 - e^{-2\mu B/\sigma^2}},$$

respectively, for $\mu \neq 0$; for $\mu = 0$ they are understood as the limits $\sigma^2/(2B)$ and $-\sigma^2/(2B)$. The steady-state probability density function of $b_t$ is equal to 

$$f_Z(z) = \frac{(2\mu/\sigma^2)\, e^{2\mu z/\sigma^2}}{e^{2\mu B/\sigma^2} - 1}$$

for $0 \le z \le B$. 
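
As a numerical illustration of the reflected-Brownian-motion quantities used in Lemma V.1, the sketch below evaluates the standard closed forms for a Brownian motion with drift $\mu$ and standard deviation $\sigma$ regulated to stay in $[0, B]$. The closed forms are assumed from the textbook treatment of two-sided regulated Brownian motion, and the function names are hypothetical.

```python
import math


def lower_rate(mu, sigma, B):
    """Long-run average of V_t / t: upward pushing at the empty boundary."""
    if mu == 0.0:
        return sigma ** 2 / (2.0 * B)
    theta = 2.0 * mu / sigma ** 2
    return mu / math.expm1(theta * B)


def upper_rate(mu, sigma, B):
    """Long-run average of Q_t / t: downward pushing at the full boundary (<= 0)."""
    if mu == 0.0:
        return -sigma ** 2 / (2.0 * B)
    theta = 2.0 * mu / sigma ** 2
    return mu / math.expm1(-theta * B)


def density(z, mu, sigma, B):
    """Steady-state density of the stored energy on [0, B]."""
    if mu == 0.0:
        return 1.0 / B
    theta = 2.0 * mu / sigma ** 2
    return theta * math.exp(theta * z) / math.expm1(theta * B)


if __name__ == "__main__":
    mu, sigma, B = 0.3, 1.0, 2.0
    # In steady state the drift and the two boundary rates must cancel.
    print(mu + lower_rate(mu, sigma, B) + upper_rate(mu, sigma, B))
```

The balance check reflects that the stored energy stays bounded, so the net long-run drift of $Z_t + V_t + Q_t$ has to vanish.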

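Using the above lemma, we can approximate the VOLL cost for general $c$ and its first-order 
derivative by 

$$J_{R+1}(x_{R+1}) = c\,\mathbb{E}[V_{R+T+1}] \approx cT \lim_{t\to\infty} \frac{1}{t}\,\mathbb{E}[V_t] = c\,\frac{\sigma_{R+1}^2}{2B}\, h\!\left(\frac{2B}{\sigma_{R+1}^2}\,\big(x_{R+1} - D(Y_{R+1})\big)\right), \tag{16a}$$

$$\nabla J_{R+1}(x_{R+1}) \approx c\, h'\!\left(\frac{2B}{\sigma_{R+1}^2}\,\big(x_{R+1} - D(Y_{R+1})\big)\right), \tag{16b}$$

where 

$$h(x) = \begin{cases} \dfrac{x}{e^{x} - 1} & \text{if } x \neq 0, \\ 1 & \text{if } x = 0, \end{cases} \qquad h'(x) = \begin{cases} \dfrac{e^{x} - 1 - x e^{x}}{(e^{x} - 1)^2} & \text{if } x \neq 0, \\ -\dfrac{1}{2} & \text{if } x = 0. \end{cases}$$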



Remark V.2. Formulae (16) reveal the role played by storage explicitly. An important observation 
is that scaling $B$ and $\sigma_{R+1}^2$ by the same constant does not affect $J_{R+1}(x_{R+1})$ and its derivative. 
That is, a system with more fluctuating wind (deeper penetration) and large storage can have the 
same terminal cost, and thus the same dispatch thresholds, as another system with less fluctuating wind 
and small storage, provided the ratio $B/\sigma_{R+1}^2$ is fixed. This quantifies the notion that "storage firms the 
wind" in the context of dispatch. 
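
The scaling observation can be checked numerically. The sketch below is a minimal illustration, assuming the approximate terminal cost has the form $c\,(\sigma^2/2B)\,h(2B(x - D)/\sigma^2)$ with $h(y) = y/(e^y - 1)$, which is our reading of (16); the function names and sample values are illustrative only.

```python
import math


def h(y):
    """h(y) = y / (exp(y) - 1), extended by continuity with h(0) = 1."""
    if abs(y) < 1e-12:
        return 1.0
    return y / math.expm1(y)


def h_prime(y):
    """Derivative of h; a first-order series is used near zero, h'(0) = -1/2."""
    if abs(y) < 1e-6:
        return -0.5 + y / 6.0
    e = math.expm1(y)
    return (e - y * math.exp(y)) / e ** 2


def approx_cost(x, d_mean, sigma2, B, c):
    """Assumed form of the approximate terminal VOLL cost."""
    scale = sigma2 / (2.0 * B)
    return c * scale * h((x - d_mean) / scale)


def approx_grad(x, d_mean, sigma2, B, c):
    """Derivative of approx_cost with respect to x."""
    scale = sigma2 / (2.0 * B)
    return c * h_prime((x - d_mean) / scale)


if __name__ == "__main__":
    # Scaling B and sigma^2 by the same factor (here 10) leaves the cost unchanged.
    print(approx_cost(0.5, 0.4, 0.04, 0.1, 1000.0))
    print(approx_cost(0.5, 0.4, 0.40, 1.0, 1000.0))
```

Under this form the cost depends on $B$ and $\sigma_{R+1}^2$ only through their ratio, which is the invariance the remark describes.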

The approximate VOLL cost is convex. Thus, the approximate dispatch policy is still characterized 
by (01) except that the thresholds are $\psi_r = \Delta_r + D(Y_r)$, where $\Delta_r$ satisfies 

Cr = Cr+l (1 - f{Ar > A^+i + ej) 

+ Cr+2 (P{A, > A,+i + e,} - P{A, > A,+i + e„ A, > A,+2 + e^+^j) + 

+ CR(F{Ar > Ar+1 + er, • • • , A, > Afi_i + 

- P{A, > A,+i + e„ . . . , A, > A^ + ef-i}) 

2B 



- cE 



1 



{A,.>Ar+l+er, 



h' 



a 



-(A, 



VI. Numerical results 

A. Setup 

We utilize the published forecast performance curve from Red Electrica Espana (Fig. 4(a)) 
to compare the costs of various dispatch policies. Let $\sigma(t)$ be the standard deviation of the $t$-hours-ahead 
forecast error. The error explained from stage $r-1$ to stage $r$ is assumed to be a 
Gaussian random variable with zero mean and variance $\sigma(t_{r-1})^2 - \sigma(t_r)^2$. We assume that 
at $t = 0.25$, the forecast error of the mean of the deficit contributes 20% of the error variance, 
and thus $\sigma_{R+1}^2 = 0.8\,\sigma(0.25)^2$. For the discrete-time model, the number of storage operation time 
intervals is $|\mathcal{T}| = 60$. 

We consider a 3-stage dispatch with day-ahead, hour-ahead, and 15-minute-ahead stages. The 
energy purchase prices are based on published average energy prices in California. We 
set the day-ahead price to $52 per MWh, the hour-ahead price to $60 per MWh, the 15-minute-ahead 
price to $72 per MWh, and the VOLL to $1000 per MWh. The mean of the deficit $D$ is 
normalized and lies between $-1$ and $+1$. For a policy $\phi$, we estimate the cost $J_\phi(D, B)$ by 
2000 Monte Carlo runs of forecast errors. 
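
The variance bookkeeping in this setup can be sketched as follows. The forecast-error standard deviations used in the example are made-up placeholders, not values read from the published curve, and the function names are hypothetical.

```python
def stage_error_variances(sigma_by_stage):
    """Variance of the error explained between consecutive stages.

    sigma_by_stage holds sigma(t_1), sigma(t_2), ... for successive dispatch
    stages, so the entries must be non-increasing as delivery approaches.
    """
    variances = []
    for s_prev, s_next in zip(sigma_by_stage, sigma_by_stage[1:]):
        v = s_prev ** 2 - s_next ** 2
        if v < 0:
            raise ValueError("forecast error must not grow as delivery approaches")
        variances.append(v)
    return variances


def delivery_variance(sigma_last, mean_share=0.2):
    """Variance left for the deficit process once the share attributed to the
    forecast error of the mean (20% in the setup above) is removed."""
    return (1.0 - mean_share) * sigma_last ** 2


if __name__ == "__main__":
    sigmas = [0.5, 0.3, 0.2]  # placeholder values for day-, hour-, 15-minute-ahead
    print(stage_error_variances(sigmas))
    print(delivery_variance(sigmas[-1]))
```

The stage variances telescope: together with $\sigma(t_R)^2$ they add up to the day-ahead variance $\sigma(t_1)^2$.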



B. Comparing dispatch approaches 

In addition to the optimal dispatch policy in Section IV and the approximate algorithm in 
Section V, we also consider the following two dispatch approaches as benchmarks. The $3\sigma$-rule 
assumes $\Delta_r = 3\sigma(t_r)$. The ideal policy is the optimal dispatch given a perfect forecast and is 
given by the following linear program, 

$$\begin{aligned} \text{minimize} \quad & c_1 x_1 + c \sum_{t \in \mathcal{T}} [D_t - x + u_t]_+ \\ \text{subject to} \quad & b_{t+1} = b_t + u_t, \\ & -b_t \le u_t \le B - b_t. \end{aligned}$$

Fig. 4: Illustration of the forecast error curve from Red Electrica Espana and the storage operation 
costs for the discrete-time and continuous-time models. (a) Forecast error curve (horizontal axis: time horizon in hours). (b) Storage operation costs. 

Since a perfect forecast is available in this case, it is always optimal to make all purchases in 
the day-ahead market, where the price is lowest. Denote the cost of the ideal policy by $J_0(D, B)$. 
For any policy $\phi$, $J_\phi(D, B) \ge J_0(D, B)$, and the difference is the integration cost of policy $\phi$: 
$C_I = J_\phi(D, B) - J_0(D, B)$. 

Fig. 4(b) shows the storage operation costs for the discrete-time model and the approximate 
continuous-time model. The approximate model overestimates the storage operation cost for small 
storage capacity since the discrete-time model does not consider the cost caused by the variation 
within each time interval. For large storage capacity, the continuous-time model underestimates 
the storage operation cost since it assumes that the initial stored energy follows the steady-state 
distribution, whereas the discrete-time model assumes it is zero. 

In Fig. 5(a) we compare the cost $J_\phi(D, B)$ of the $3\sigma$ strategy, the optimal dispatch policy, 
the approximate policy, and the ideal policy for $B = 0.001$. The $3\sigma$ strategy has the highest 
cost. The cost of the approximate policy is slightly higher than the optimal cost. The integration 
costs with respect to the cost of the ideal policy are shown in Fig. 5(b). 

Fig. 5: The costs and integration costs for the $3\sigma$ strategy, the optimal dispatch policy, the 
approximate policy, and the ideal policy for $B = 0.001$. (a) Costs. (b) Integration costs. 

Fig. 6: The costs of the optimal policy and the approximate policy for $D = 0.4$. 

Fig. 6 shows the cost $J_\phi(D, B)$ of the optimal dispatch policy and the approximate policy for 
$D = 0.4$. As we observe in Fig. 4(b), the approximate model is not suitable for very small and 
very large storage capacities and thus has higher costs in those regimes. 
VII. Conclusion 

The paper extends Risk Limiting Dispatch to incorporate fast-ramping storage. We study the structural 
properties of the optimal dispatch and show that optimality is achieved by 
following a simple threshold rule. The optimal storage operation policy is given in closed form. 
Explicit formulae for evaluating the total expected cost over the delivery interval are obtained, 
together with efficient algorithms for computing the dispatch thresholds from these cost estimates. 

A simpler continuous-time approximation to storage operation results in a simple expression for the 
terminal cost-to-go as a function of the storage capacity $B$ and the deficit process variance $\sigma_{R+1}^2$. 
The relationship quantifies the notion that storage smooths the wind. The algorithms are 
illustrated and compared using numerical results. In ongoing work, we are extending the method 
to include slow storage, which requires modeling multiple simultaneous market decisions. We 
would also like to investigate the incorporation of ramping constraints for the slow storage problem. 
Finally, the effects of network congestion in a scenario with storage can be considered. 
Appendix A 
Proofs for Section III 

Proof of Proposition III.2 

We first state and prove a standard result. 

Proposition A.1 ([30]). Let $X$ be a nonempty set with $A_x$ a nonempty set for each $x \in X$. Let 
$C = \{(x, y) : y \in A_x, x \in X\}$, let $J$ be a real-valued function on $C$ and define 

$$f(x) = \inf\{J(x, y) : y \in A_x\}, \quad x \in X.$$

If $C$ is a convex set and $J$ is a convex function on $C$, then $f$ is a convex function on any convex 
subset of $X^* = \{x : x \in X, f(x) > -\infty\}$. 

Proof: Pick $x_1$ and $x_2$ in $X^*$, so $f(x_1) > -\infty$, $f(x_2) > -\infty$. Then for all $\gamma > 0$ there are 
$y_1$ and $y_2$ with $(x_i, y_i) \in C$, $i = 1$ and $2$, such that $f(x_i) + \gamma > J(x_i, y_i)$. Pick $t \in (0, 1)$, and 
let $(x, y) = t(x_1, y_1) + (1 - t)(x_2, y_2)$, which is in $C$ because $C$ is convex. Now 

$$t f(x_1) + (1 - t) f(x_2) > t J(x_1, y_1) + (1 - t) J(x_2, y_2) - \gamma \ge J(x, y) - \gamma \ge f(x) - \gamma,$$

with the second inequality due to convexity of $J$ on $C$. Letting $\gamma \to 0$ yields the convexity of 
$f$. ■ 

Given $g(D, u, x)$ convex in $u$ and $x$, we have $\mathbb{E}\{g(D, u, x) \mid Y_{R+1}\}$ convex in $u$ and $x$. Since 
the feasible set of $u$ is an affine set (and therefore is convex), by Proposition A.1 we have $J_{R+1}(x_{R+1})$ convex 
in $x_{R+1}$. 

Suppose $J_{r+1}(x_{r+1})$ is convex in $x_{r+1}$. We proceed to prove $J_r(x_r)$ is convex in $x_r$. Note 

$$J_r(x_r) = \inf_{s_r \in S_r} \mathbb{E}\{c_r s_r + J_{r+1}(x_{r+1}) \mid Y_r\}.$$

Since 

$$c_r s_r + J_{r+1}(x_r + s_r)$$

is convex in $s_r$ and $x_r$, and the conditional expectation preserves the convexity, by invoking 
Proposition A.1 we have $J_r(x_r)$ is convex in $x_r$. 

Therefore by induction, we have $J_r(x_r)$ convex in $x_r$ for all $r \in \mathcal{R} \cup \{R + 1\}$. 

Proof of Theorem III.4 

By Proposition III.2, $J_r(x_r)$ is convex for $r \in \mathcal{R} \cup \{R + 1\}$. Further, $-c_r \in \nabla J_{r+1}(\psi_r)$ 
by the definition of constrained subgradient. Since $J_{r+1}$ is $Y_r$-adapted, we have $\psi_r$ is $Y_r$-adapted. 
Given 

$$J_{r+1}(x) \ge J_{r+1}(\psi_r) - c_r (x - \psi_r),$$

we have 

$$c_r x + J_{r+1}(x) \ge c_r \psi_r + J_{r+1}(\psi_r).$$

Therefore 

$$s_r^* = \psi_r - x_r$$

if $s_r^* \in S_r$. This relation gives the optimal threshold for stages $r \in \mathcal{R}$. For purchase-only or 
sell-only stages, we show that the rule in Theorem III.4 gives the optimal dispatch. Consider a purchase stage. If 
$\psi_r - x_r > 0$, the constraint is not tight and $s^* = \psi_r - x_r$. Otherwise, we show $s^* = 0$. Suppose 
$s^* = \hat{s}^* \neq 0$; then there exists $0 < \alpha < 1$ such that $\alpha \hat{s}^* + (1 - \alpha)(\psi_r - x_r) = 0$. However, given 
$J_r(x_r, s_r) = \mathbb{E}[c_r s_r + J_{r+1}(x_r + s_r) \mid Y_r]$ convex in $s_r$, we have 

$$J_r(x_r, \hat{s}^*) < J_r(x_r, 0) = J_r(x_r, \alpha \hat{s}^* + (1 - \alpha)(\psi_r - x_r)) \le \alpha J_r(x_r, \hat{s}^*) + (1 - \alpha) J_r(x_r, \psi_r - x_r),$$

where the first inequality is based on the assumption that $\hat{s}^*$ is a minimizer of $J_r(x_r, s_r)$ while 
$0$ is not, and the last inequality is due to the convexity. Consequently 

$$J_r(x_r, \hat{s}^*) < J_r(x_r, \psi_r - x_r),$$

which is clearly a contradiction since $\psi_r - x_r$ is the minimizer of the unconstrained problem. 
Therefore $s^* = [\psi_r - x_r]_+$. Similarly for the sell-only stage, $s^* = [\psi_r - x_r]_-$. 
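
The threshold rule can be illustrated numerically. The following is a minimal sketch, assuming a known convex cost-to-go function and a bracketing interval for the threshold; the ternary search, the function names and the quadratic example cost are illustrative choices, not the procedure used in the paper.

```python
def threshold(c_r, J_next, lo, hi, iters=200):
    """Minimizer psi of c_r * y + J_next(y) on [lo, hi] for convex J_next.

    Ternary search is used purely for illustration; any one-dimensional
    convex minimizer would do.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if c_r * m1 + J_next(m1) <= c_r * m2 + J_next(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def purchase(x, psi):
    """Purchase-only stage: s* = [psi - x]_+."""
    return max(psi - x, 0.0)


def sale(x, psi):
    """Sell-only stage: s* = [psi - x]_-."""
    return min(psi - x, 0.0)


if __name__ == "__main__":
    # Hypothetical convex cost-to-go: quadratic penalty on the shortfall below 1.0.
    J = lambda y: 500.0 * max(1.0 - y, 0.0) ** 2
    psi = threshold(60.0, J, -5.0, 5.0)
    print(psi, purchase(0.5, psi), purchase(1.2, psi))
```

When the current position already exceeds the threshold, the purchase constraint binds and nothing is bought, which is the case handled by the contradiction argument above.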

Appendix B 
Proofs for Section IV 

Proof of Lemma IV.1 

We need to prove the convexity of the cost-to-go function and the optimality of the proposed 
control rule. 

• Notice $J_{R+T+1}(x, b_{R+T+1})$ is convex in $(x, b_{R+T+1})$. 

Suppose $J_{t+1}(x, b_{t+1})$ is convex in $(x, b_{t+1})$. We proceed to prove $J_t(x, b_t)$ is convex in 
$(x, b_t)$. Note 

$$J_t(x, b_t) = \inf_{u_t \in \mathcal{U}(b_t)} \mathbb{E}\{c\,[D_t - x + u_t]_+ + J_{t+1}(x, b_{t+1}) \mid Y_t\}.$$

Since 

[Dt- x + ut]+ + Jt+i (^x, A (^t + l^[ut]+ + ^[Wil- 
is convex in $(x, b_t, u_t)$, and the conditional expectation preserves the convexity, by invoking 
Proposition A.1 we have $J_t(x, b_t)$ is convex in $(x, b_t)$. 

Therefore by induction, we have $J_t(x, b_t)$ is convex in $(x, b_t)$ for all $t \in \mathcal{T} \cup \{R + T + 1\}$. 
• The optimal control policy of storage is a standard result. See Remark 4.3 in [|T8l for 
intuitional explanation, and IITtI for detailed proof. 
Proof of Corollary IV.2 

We prove this result by induction over the length of the storage operation problem. For the base 
case, let $T = 1$; we have 

Jr+i{xr+i]T = 1) = cE{[L)r+i - , 
bR+i = 0, and = 5 A [a: - A + h^R+il 

Suppose the expression for $J_{R+1}(x_{R+1})$ and $b^*$ holds for $T = l$, and consider the case $T = l + 1$. 
By the optimal storage control rule, we have 

Jr+i{xr+i\ T = / + 1) = cE I Jii+i(x/j+i; T = I) + - x- 

Invoking the induction hypothesis on the recursive formula for the sequence 6^, for T = I, whose 
last term gives b*R_^_i_^_^, we have 



Jr+i{xr+i; T = l+l) = cE< Jr+i{xr+i] T = l)+ Dr+i+i - x- B A [x - Dr+i + bR+i] 



Plugging in the expression of jR+i{xR+i]T = I) yields the desired result on jR+i{xR+i), and 
the expression of &/j+i+2' which is the only additional term in the sequence b^, for T = / + 1, 
follows from the optimal storage control rule. 
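
The recursion behind this induction can be sketched in code. The example below assumes the per-stage penalty $[D_t - x - b_t]_+$ and the storage update $b_{t+1} = B \wedge [x - D_t + b_t]_+$ with $b_{R+1} = 0$, and it draws independent Gaussian deficits purely for illustration; the function names and parameters are hypothetical.

```python
import random


def terminal_cost(deficits, x, B, c=1.0):
    """Total VOLL cost along one deficit path under the storage recursion."""
    b = 0.0                                   # storage starts empty
    cost = 0.0
    for d in deficits:
        cost += c * max(d - x - b, 0.0)       # penalty [D_t - x - b_t]_+
        b = min(B, max(x - d + b, 0.0))       # b_{t+1} = B ^ [x - D_t + b_t]_+
    return cost


def monte_carlo_cost(x, B, mean, std, periods, runs, c=1.0, seed=0):
    """Monte Carlo estimate of the expected terminal cost.

    Assumes i.i.d. Gaussian per-period deficits (an illustrative choice).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        path = [rng.gauss(mean, std) for _ in range(periods)]
        total += terminal_cost(path, x, B, c)
    return total / runs


if __name__ == "__main__":
    for B in (0.0, 0.5, 2.0):
        print(B, monte_carlo_cost(x=0.5, B=B, mean=0.4, std=0.3, periods=60, runs=2000))
```

With a common seed the estimate is non-increasing in $B$, since a larger capacity never leaves less stored energy on any given path.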



Proof of Proposition IV.3 and Theorem IV.4 

For the sake of limited space, we prove Proposition IV.3 in the context of Theorem IV.4. 
The general proof of Proposition IV.3 can be done similarly by induction. Consequently there 
are two items to prove: 

• Proposition IV.3 holds, i.e., $J_{R+1}(x_{R+1})$ in Corollary IV.2 can be expressed as 

$$J_{R+1}(x_{R+1}) = c\,\mathbb{E}\Big[\sum_{t \in \mathcal{T}} \sum_{k \in \mathcal{K}_t} [D_t^k - x]_+\, p_t^k \,\Big|\, Y_r\Big].$$

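Proof: Equivalent to the form in Corollary IV.2, we have 

$$\begin{aligned} J_{R+1}(x_{R+1}) = c\,\mathbb{E}\Big\{ & [D_{R+1} - x]_+ + \big[D_{R+2} - x - B \wedge [x - D_{R+1}]_+\big]_+ \\ & + \big[D_{R+3} - x - B \wedge [x - D_{R+2} + B \wedge [x - D_{R+1}]_+]_+\big]_+ + \dots \\ & + \big[D_{R+T} - x - B \wedge [x - D_{R+T-1} + B \wedge [\dots [x - D_{R+1}]_+ \dots]_+]_+\big]_+ \,\Big|\, Y_r \Big\}. \end{aligned} \tag{17}$$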
Denote the expected penalty that will occur at stage $t \in \mathcal{T}$ as $V_t$, i.e., 

$$V_t = c\,\mathbb{E}\Big\{ \big[D_t - x - B \wedge [x - D_{t-1} + B \wedge [\dots [x - D_{R+1}]_+ \dots]_+]_+\big]_+ \,\Big|\, Y_r \Big\}.$$

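We have $J_{R+1}(x_{R+1}) = \sum_{t \in \mathcal{T}} V_t$. We then need to prove $V_t = c\,\mathbb{E}\{\sum_{k \in \mathcal{K}_t} [D_t^k - x]_+\, p_t^k \mid Y_r\}$, i.e., 

$$\mathbb{E}\Big\{\sum_{k \in \mathcal{K}_t} [D_t^k - x]_+\, p_t^k \,\Big|\, Y_r\Big\} = \mathbb{E}\Big\{ \big[D_t - x - B \wedge [x - D_{t-1} + B \wedge [\dots [x - D_{R+1}]_+ \dots]_+]_+\big]_+ \,\Big|\, Y_r \Big\}, \tag{18}$$

which will prove the expression for $J_{R+1}(x_{R+1})$ by summing up the terms corresponding to 
each storage operation stage. By observation, the equation above holds if under the event 
indicated by $p_t^k$ we have 

$$D_t - B \wedge [x - D_{t-1} + B \wedge [\dots [x - D_{R+1}]_+ \dots]_+]_+ = D_t^k.$$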



We prove this statement by induction. The base case holds since ICr+i — {1} , p^+i = 1 
and D]^_^_^ = -Dr+i. Suppose the result holds for stage t — 1 and consider the stage t. If 
the event indicated by p] holds, i.e., one of the Kt_i pairs of events, indicated by p[_i and 
l{D'j._^ > x}, hold simultaneously, we have 



Df-BA 



X 



Dt-i + BA [...[x-Dr+i]^...] 



J + 



^Dt-B A[x- £>t_i]+ ^ Dt- B AO ^ Dl 

The first equality is due to p[_-^ = 1 and the induction hypothesis. The second equality is 
due to l{Dl_^ > x} = 1 and the last equality is the definition of Dl. 
Similarly, if the event indicated by pf^* holds, , i.e., one of the Kt-i pairs of events, indicated 
by and l{D'j._-^ < x — B}, hold simultaneously, we have 



Dt-BA 



X 



-Dt-i + BA [...[x-Dr+i]^...] 



+ 



Dt-BA[x- Dl,]+ ^Dt-B^ DfK 
If the event indicated by $p_t^k$ holds, $k \in \mathcal{K}_t \setminus \{1, K_t\}$, 



Dt-BA 



X 



Dt-i + BA[...[x-Dn+i]^...]_ 



Dt-BA[x- D^I,% = Dt-[x- = D 



As a consequence, Equation (18) holds for any $t \in \mathcal{T}$, which completes the proof. 
Notice that the second equality in Equation (7) is another form of the same result, as the set 
of inequalities denoted by $p_t^k$ can always be expressed as vector inequalities. For a general 
deficit process, this form is not useful in terms of computation (and therefore we do not derive 
the expressions for the upper and lower bounds involved). But in the setup of independent 
errors, this form gives a computationally efficient way of evaluating the thresholds, as explained 
in Section IV-C. ■ 

• The constrained subgradient for the cost-to-go function at stage $R + 1$ given the information 
at a dispatch stage $r$ is 

$$\nabla J_{R+1}(x_{R+1}) = -c\,\mathbb{E}\Big[\sum_{t \in \mathcal{T}} \sum_{k \in \mathcal{K}_t} [k \wedge (K_t + 1 - k)]\, p_t^k\, \mathbf{1}\{D_t^k > x\} \,\Big|\, Y_r\Big].$$



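Proof: Using the explicit definition of $D_t^k$, we notice 

$$\frac{\partial D_t^k}{\partial x} = \begin{cases} -(k - 1) & \text{if } 1 \le k \le t - R, \\ -(K_t - k) & \text{if } t - R < k \le K_t. \end{cases}$$

Consequently, 

$$\frac{\partial [D_t^k - x]}{\partial x} = \begin{cases} -k & \text{if } 1 \le k \le t - R, \\ -(K_t + 1 - k) & \text{if } t - R < k \le K_t, \end{cases}$$

or more concisely 

$$\frac{\partial [D_t^k - x]}{\partial x} = -[k \wedge (K_t + 1 - k)].$$

Invoking the chain rule and Leibniz's rule for differentiation under the integral sign finishes 
the proof. ■ 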

Proof of Proposition IV.5 

By induction over the levels of the lattice. The base case p]l_^_l = 1 holds trivially. Suppose the 
expressions for p^, and p| hold for all t < / and k G ICt. Consider the corresponding 



probabilities in level /. The correctness of Equation (fTTI) follows from the definition of pf_i, 
and p^_i. Referring to the algebraic recombinant lattice, we notice, for A; = 1 (or k = Ki) 



Pi 



P(visit node (/, k) and then move left) 

P(visit node (/, A;))P(move left from node (/, k)) 



Here the last equality follows from the fact that e^^ = ef^' = ei is independent from all the past 



errors. Further, this result agrees with the form in Equation (1121) because h = for k = 1 
(or k = Ki). Now consider k E )Ci\{l, Ki}. By Equation (flOl ). ef is independent with ej for 
j < l — h, where h = (fc — 1) A {Ki — k) . Thus similar to above, we can break the joint probability 
into product by the independence: 



Pi 



(visit node (/, k) and then move left) 
(visit node (l — h, k — h)) ■ 

(starting from node (I — h, k — h), visit node (/, A;) and then move left from it) 



The second term in the last line follows from the observation that k — h = l when k < (Kt + l)/2 
and k — 1 = Kt-h otherwise. That is, on the lattice, node {I — h, k — h) is always a "boundary 
node" that is corresponding to the state either the storage is full or empty. Further, starting 
from such a node, the only path to node (/, A;) is by moving to the middle child recursively 
h times. The exact same reasoning holds for p^ and pf by replacing the last inequality on ef 
correspondingly. Thus we have proved Equations (fTT)) . (fT2)) . (fT3l) and (fT4)) hold inductively. Note 
all the bounds in the inequalities involved are due to Corollary IIV.3[ with the predicted deficit 
term (See Equation dH)) plugged in. 



Proof of Lemma IV.6 

For the terminal cost-to-go, by Equation (7), 



Jr+i{xr+i) =cE 



.teT kaKt 



A+P^t 



cE 



.fer feG/Ct 
E { [A' + E [6,^|i;, < Et-/ < E^, e,^ > ^] _ a;] ^} . 

For the subgradient, notice Jr+i{xr+i) is a continuous function (see (flTl) ). Given the Gaussian 
prediction error, it follows all the terms due to differentiating the integrating limits cancel. The 
remaining terms are given in (fT5l) . 